Pan-cancer analysis revealing that PTPN2 is an indicator of risk stratification for acute myeloid leukemia

The non-receptor protein tyrosine phosphatase gene family (PTPNs) is involved in the tumorigenesis and development of many cancers, but the role of PTPNs in acute myeloid leukemia (AML) remains unclear. We comprehensively evaluated the expression patterns and immunological effects of PTPNs in a pan-cancer analysis based on RNA sequencing data from The Cancer Genome Atlas, and identified PTPN2 as the most valuable gene. Further investigation of PTPN2 expression in different tissues and cells showed a robust correlation with AML. PTPN2 was then systematically correlated with immunological signatures in the AML tumor microenvironment, and its differential expression was verified using clinical samples. In addition, a prediction model was developed, validated, and compared with other models. The systematic analysis of the PTPN family suggests that the effect of PTPNs on cancer may be mediated through cell cycle-related pathways. PTPN2 was highly expressed in hematologic diseases and bone marrow tissues, and its differential expression between AML patients and healthy individuals was verified in clinical samples. Based on its correlation with immune infiltrates, immunomodulators, and immune checkpoints, PTPN2 was found to be a reliable biomarker in immunotherapy cohorts and a prognostic predictor of AML. The PTPN2-based risk score can accurately predict prognosis and the response to cancer immunotherapy. These findings reveal a correlation between PTPNs and immunophenotype, which may be related to the cell cycle. PTPN2 is a diagnostic biomarker and potential therapeutic target, providing targeted guidance for clinical treatment.


Depicting the pan-cancer expression pattern and genomic pattern of PTPN2
In the analyses above, PTPN2 expression was widely up-regulated and significantly activated apoptosis and cell cycle pathways in pan-cancer. In CRISPR screening, PTPN2 was negatively correlated with phenotype 10. Considering also the high expression of PTPN2 in hematopoietic cells, PTPN2 was studied further.
The expression of PTPN2 was significantly up-regulated in several cancers (Fig. 2a) and was markedly expressed in almost all types of cancer, especially LAML, ESCA, ALL, and CLL (Fig. 2b). Moreover, PTPN2 protein was highly expressed in tumor tissues according to the Clinical Proteomic Tumor Analysis Consortium (CPTAC) project (Fig. S3a). This reveals, to a certain extent, the correlation between PTPN2 and cancer occurrence. The Human Protein Atlas (HPA) database was then used to evaluate the RNA and protein expression of PTPN2 in various organs. The results indicated that PTPN2 was highly expressed in immune-related organs such as lymph node, tonsil, bone marrow, and thymus (Fig. S3c). This was consistent with results from the Harmonizome database, suggesting a robust link between PTPN2 and the immune and blood systems. In addition, PTPN2 was barely expressed in adipose tissue (Fig. S3d).
Next, we evaluated the expression of PTPN2 in paired tumor and normal samples. PTPN2 was differentially expressed between normal and tumor tissues in 13 types of cancer and was generally upregulated in tumor tissues (Fig. S3b). According to subcellular localization data from the HPA, PTPN2 was found in the nucleoplasm and was highly expressed in almost all cancer cell lines, including SiHa, U-2 OS, U-251 MG, A-431, and CACO-2 (Fig. S3e). The expression levels of PTPN2 across cell lines and tissues were analyzed based on the Harmonizome database. The results indicated that PTPN2 was widely expressed across organs of the whole body. Notably, PTPN2 expression was highest in lymphocytes and blood (Fig. S3f). Its expression in different tissues, organs, and systems was also analyzed. The results indicated that PTPN2 was widely expressed in the immune system, especially in thymus tissues (Fig. S3g). Broadly speaking, mutation and amplification are the most common alteration types in PTPN2, accounting for all alterations in some cancers (Fig. 2c). The lollipop chart indicated that PTPN2 mutations mainly occurred in the Y_phosphatase domain between amino acids 0 and 415, with missense as the dominant mutation type. Among exons, mutations mainly occurred in exon 8, and the mutation site S298C/F/Y had the highest mutation frequency (Fig. S4a). The type of CNV was correlated with PTPN2 expression, and amplification was associated with the highest mRNA expression level (Fig. S4c). PTPN2 alterations were mostly shallow deletions, while mutation counts peaked in endometrial cancer and melanoma (Fig. S4b). Fraction Genome Altered (FGA) of PTPN2 in 30 types of cancer was assessed, and the results indicated that shallow deletions are widespread in many cancer types (Fig. S4d).
Mismatch repair (MMR) is an intracellular mechanism; loss of function of its key genes leaves DNA replication errors unrepaired, which raises the somatic mutation rate and is a potential cancer driver 11. Therefore, the correlations between the expression of PTPN2 and five important MMR-related genes (MLH1, MSH2, MSH6, PMS2 and EPCAM) in pan-cancer were analyzed, which indicated that PTPN2 was significantly correlated with MMR genes in 29 types of cancer. In these tumors, MLH1, MSH2, MSH6 and PMS2 were positively correlated with PTPN2, suggesting that PTPN2 may play a role in tumors through the regulation of the MMR process (Fig. S4e).
RNA modification can directly affect the chemistry of RNA and thus affect cancer progression 9. Therefore, 44 regulators of three types of cancer-related RNA modifications were collected, covering N6-methyladenosine (m6A), N1-methyladenosine (m1A), and 5-methylcytosine (m5C), and their correlation with PTPN2 expression was analyzed. The results showed that the expression of PTPN2 was significantly positively correlated with RNA modification genes in most cancers. This suggests that PTPN2 expression potentially affects the RNA modification process in pan-cancer (Fig. 2e). The expression of four methyltransferases (DNMT1, DNMT2, DNMT3A and DNMT3B) in various tumor types was significantly correlated with the expression of PTPN2. Notably, the co-expression coefficients were significantly higher in SKCM and STAD (Fig. 2d).

PTPN2 expression is related to ICP, immunomodulatory genes, and immune infiltration levels in pan-cancer
To explore the main pathways through which PTPN2 exerts immunomodulatory effects, the samples were grouped according to PTPN2 expression. DEGs between groups were then screened and subjected to GSEA to analyze the correlation between PTPN2 expression and cancer-related pathways. The results indicated that the DEGs were mainly enriched in cell cycle-related pathways (MYC and E2F), interferon γ response, inflammatory response, epithelial mesenchymal transition, allograft rejection, and oxidative phosphorylation (Fig. 3a).
Previous studies showed that ICPs are important for maintaining self-tolerance and preventing excessive immune responses that could damage healthy tissues. However, some cancer cells can take advantage of these checkpoints to evade attack by the immune system 12. Therefore, we investigated the correlations between the expression levels of ICPs and PTPN2 in pan-cancer to characterize the potential role of PTPN2 in immunotherapy. The expression of PTPN2 was significantly positively correlated with ICPs in most cancers (Fig. 3b), which suggests that PTPN2 might coordinate the activity of ICPs in different pathways and might be considered an ideal immunotherapeutic marker. We also examined the correlations between PTPN2 expression and the expression of various immunomodulatory genes, such as chemokine receptors, MHC molecules, immune inhibitors, and immune stimulators. The results showed a significant positive correlation between PTPN2 expression and immunomodulatory genes in most types of cancer (Fig. S5a).
We used TIMER2.0 to analyze the correlation between PTPN2 expression and immune cell infiltration in pan-cancer. The results showed a positive correlation between PTPN2 expression and various immune infiltrates, including common lymphoid progenitor, T cell follicular helper, myeloid-derived suppressor cells, B cell, neutrophil, monocyte, macrophage, myeloid dendritic cell, and CD8+ T cell, and a negative correlation with common myeloid progenitor, endothelial cell, hematopoietic stem cell, and NKT cell (Fig. 3c). PTPN2 was thus found to be involved in immune infiltration and to play an important role in immune-tumor interaction. It should be noted that the trend of this correlation was different in THYM, especially for B cells and macrophages, which may be related to its different tumor microenvironment 13. PTPN2 was also found to be widely correlated with ESTIMATE score, immune score, and stromal score in pan-cancer (Fig. S5b-d). These findings indicated that PTPN2 plays a vital role in immune infiltration in pan-cancer and has the potential to serve as a response indicator in clinical practice.

Exploring the immunotherapy response, prognostic correlation, drug sensitivity, and predictive power of PTPN2 in pan-cancer
We collected survival data from the TCGA and TARGET data portals to evaluate the prognostic value of PTPN2 in pan-cancer using CoxPH and the log-rank test. PTPN2 expression was found to be a reliable biomarker in a wide range of cancer types, and was significantly correlated with overall survival in AML (Fig. 4a,b; Fig. S6a-c).
To explore the value of PTPN2 as a novel immune target, immunotherapy response and drug sensitivity were compared across PTPN2 expression levels. The results revealed significant differences in PTPN2 expression among 12 murine immunotherapy cohorts (Fig. S6d). Among them, IFN-γ or TNF-α treated mice were more likely to have elevated PTPN2 levels, while TGF-β1 treated mice were more likely to have low expression of PTPN2. The results also revealed that PTPN2 could significantly predict immunotherapy response in 5 murine immunotherapy cohorts, in which responders were more likely to have elevated PTPN2 levels (Fig. 4c). Additionally, PTPN2 was closely correlated with the efficacy of immunotherapies such as CAR-T, PD-L1, anti-PD-1 and anti-CTLA-4, indicating the potential of PTPN2 as an immunotherapy biomarker (Fig. S7).
Then, PTPN2 expression was compared with other published biomarkers in terms of predictive power for immunotherapy response, and PTPN2 was found to have an AUC of more than 0.5 in 12 immunotherapy cohorts (Fig. 4d). The positive (Fig. 4e) and negative (Fig. S6e) correlations between drug sensitivity and PTPN2 expression were analyzed. The data suggested that PTPN2 might be correlated with resistance to some antitumor drugs commonly used in clinical practice, such as Nelarabine, Fludarabine and Hydroxyurea. Among them, Nelarabine and Fludarabine have a certain curative effect on leukemia, suggesting that PTPN2 is closely correlated with drug resistance in patients with hematologic tumors.
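As a reading aid for the AUC comparison above, the following is a minimal Python sketch of how an AUC can be computed from a single biomarker. It uses the rank-based (Mann-Whitney) definition: the probability that a randomly chosen responder has a higher biomarker value than a randomly chosen non-responder, with ties counted as one half. The function name and the expression values are hypothetical and are not taken from the study, which does not describe its own AUC implementation.

```python
def auc(scores_pos, scores_neg):
    """Rank-based AUC: P(score of a positive > score of a negative), ties count 0.5."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical biomarker values (not data from the paper).
responders = [5.0, 7.0, 8.0]
non_responders = [4.0, 6.0]

value = auc(responders, non_responders)
print(f"AUC = {value:.3f}")
```

An AUC of 0.5 corresponds to a biomarker that ranks responders and non-responders no better than chance, which is why 0.5 serves as the reference threshold in the text.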

Elucidating the correlation between PTPN2 and microenvironment in AML
The analyses above revealed potential correlations between PTPN2 and hematological diseases; it was therefore necessary to further explore the expression pattern of PTPN2 in AML. PTPN2 was differentially expressed among cell types in GSE116256 (Fig. 5a, b), GSE135851, and GSE154109 (Fig. S9). The results indicated that PTPN2 expression was high in CD8+ Tex cells, plasma cells, monocytes and promonocytes, and low in CD4+ T cells and hepatic stellate cells. Moreover, PTPN2 was strongly correlated with immune response (Fig. S8a).
qRT-PCR confirmed that the expression level of PTPN2 in the bone marrow samples of 21 AML patients was significantly higher than that in 10 normal donors (Fig. 5c). The prognostic role of PTPN2 in the TCGA-LAML cohort was also investigated, suggesting that PTPN2 is a diagnostic biomarker for AML and may also be a prognostic marker (Fig. 5d). We collected 87 signatures related to TME and tumor phenotypes from the IOBR package and analyzed the correlation between PTPN2 expression and these signatures. The results showed that PTPN2 expression had extensive and consistent positive or negative correlations with the signatures in the TCGA (Fig. S8b,c) and beatAML (Fig. 5e,f) cohorts, indicating its close association with TME.

Drug discovery of PTPN2 in AML
Drug sensitivity data for CCLE cell lines, obtained from CTRP and PRISM, indicated that patients with low PTPN2 expression were highly sensitive to three CTRP-derived compounds and six PRISM-derived compounds (Fig. 6a), whereas patients with high PTPN2 expression were highly sensitive to five CTRP-derived compounds (panobinostat, ouabain, neuronal differentiation inducer III, BRD-K61166597 and B02) and three PRISM-derived compounds (romidepsin, RGFP966, and imidapril) (Fig. S10). The difference between the high and low PTPN2 expression groups was then analyzed to obtain the molecular characteristics of the disease for cMAP analysis. Five compounds were identified as most strongly correlated with the PTPN2 expression characteristics (Fig. 6b). Among them, mercaptopurine is often used as a therapeutic agent for leukemia, indicating that the correlation between PTPN2 and hematological cancers is robust.

Derivation, construction, validation, and characterization of PTPRS
The study found that PTPN2 expression in AML had a significant effect on biological characteristics and clinical outcomes. Due to the complexity of PTPN2 expression, a PTPRS was constructed to approximate and simplify it. We formed the metaX cohort (n = 953) and the metaY cohort (n = 771) as mentioned above, and the 953 samples in the metaX cohort were then randomly divided into a training set (n = 669) and a validation set (n = 284) at a ratio of 7:3. Using the cor.test function, 8667 PTPN2-related genes in metaX and 5081 genes in metaY were screened, and 1824 genes were obtained by intersection. Then, in the training set, 826 prognosis-related genes were screened out by Kaplan-Meier analysis, 30 of which were selected as effective candidate genes by univariate and LASSO CoxPH  S3). Patients in the training set, validation set, and other external validation cohorts were grouped into high-risk and low-risk groups based on the median score. Patients in the low-risk groups had longer survival times than those in the high-risk groups, suggesting that PTPRS is a reliable prognostic indicator in all AML cohorts (Fig. 7a). To test the reliability of the PTPRS, patients in TARGET-AML and TCGA-LAML were combined into an overall CP cohort (n = 516), and the prediction abilities of the PTPRS and five existing prediction systems were compared. The results suggested that PTPRS has better prediction ability (Fig. 7b,c). In addition, results of univariate (Fig. S11c) and multivariate CoxPH (Fig. S11d) in the TCGA-LAML and TARGET-AML cohorts suggested that PTPRS remained a significant and independent prognostic factor after adjusting for other clinical factors. Taken together, these results validated the good prognostic efficiency of PTPRS.
A total of 75 immunomodulatory genes were collected to analyze their expression, methylation and mutation characteristics in the high-risk and low-risk groups. In the high-risk group, the expression of immunomodulatory genes had a more significant positive correlation with methylation level, whereas in the low-risk group, the amplification and deletion frequencies of immunomodulatory genes were higher (Fig. S11e). At the same time, the high-risk group had higher immune checkpoint target-related gene expression, immune score, and immune cell infiltration levels (Fig. S11f). Based on different immune cell algorithms, the high-risk and low-risk groups showed different immune infiltration conditions, indicating that the two groups may have different immune microenvironments (Fig. 7d). We analyzed the infiltration levels of 27 cell types in the high-risk and low-risk groups. The results indicated that the high-risk group had significantly higher levels of immune cell infiltration (Fig. S11g). The high-risk and low-risk groups also had different mutation frequencies. Except for RUNX1, IDH2, and KRAS, the high-risk group had lower mutation frequencies than the low-risk group (Fig. S11h).
By applying GO and KEGG pathway enrichment analysis, the results indicated that biological processes (Fig. 7e) were mainly enriched in cell migration and cell cycle related pathways, cellular components (Fig. S11i) were mainly enriched in ribosome, lysosome and respiratory chain related pathways, and molecular function (Fig. S11j) was mainly enriched in cytokine and immune receptor related pathways. DEGs were significantly enriched in four KEGG terms (Graft-versus-host disease, Viral protein interaction with cytokine and cytokine receptor, ECM-receptor interaction, and Cytokine-cytokine receptor interaction) (Fig. 7f). To further investigate the potential differences between high-risk and low-risk groups, GSEA enrichment analysis was performed for differential genes. The results indicated that the high-risk group was significantly enriched in cancer-related signaling pathways (Fig. 7g).

Clinical values of PTPRS and construction of nomogram
Patients in the high-risk group tended to fall into the Poor risk category (p < 0.001), were generally older (p < 0.001), and had a worse prognosis (p < 0.001), whereas cytogenetic abnormality did not differ significantly between groups. Pathological stage also differed significantly between the high-risk and low-risk groups (Fig. 8a, S12a). For 37 cancer drugs, the high-risk group had lower IC50 values, indicating a higher sensitivity to these drugs (Fig. S12b). To further optimize the prediction effect of PTPRS, a nomogram containing the important predictors from CoxPH was established to predict the prognosis of AML (Fig. 8b). For example, consider a patient with AML who has a favorable risk category (0 points), an age of 66 years (28 points), no cytogenetic abnormalities (0 points), and a PTPRS of 4 (20 points). With a total score of 48, the 1-year survival rate is about 49%, the 2-year survival rate about 31%, and the 3-year survival rate about 7%. Calibration curves showed good agreement between the predicted and observed OS at 1, 2, and 3 years in the training and validation cohorts (Fig. 8c). DCA was performed to compare the clinical applicability of the nomogram with that of PTPRS. The results indicated that the nomogram could better predict OS at 1, 2, and 3 years because it added more net clinical benefit than PTPRS and other pooled models (Fig. 8d,e). The time-dependent AUC curve for OS was also plotted. Changes in AUC over time indicated that the nomogram was slightly better than PTPRS in predicting prognosis (Fig. 8f).
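The worked nomogram example above is simple addition of per-predictor points. A minimal Python sketch of that arithmetic follows; only the four point values and the total of 48 come from the text, while the dictionary keys are illustrative labels. Mapping the total to a survival probability requires the nomogram axes in Fig. 8b and is deliberately not reproduced here.

```python
# Points read off the nomogram for the example patient described in the text.
# The keys are illustrative labels, not variable names used by the authors.
example_points = {
    "risk category: favorable": 0,
    "age: 66 years": 28,
    "cytogenetic abnormality: none": 0,
    "PTPRS: 4": 20,
}

# The nomogram total is the sum of the individual predictor points.
total_points = sum(example_points.values())
print(f"Total nomogram points: {total_points}")
```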

Discussion
Intracellular non-receptor PTPs, the largest group of the cysteine PTP family, are critical for the regulation of a variety of biological processes, including but not limited to hematopoiesis, inflammatory response, immune system function and glucose homeostasis, and play an important role in the occurrence, development, metastasis and drug resistance of tumors 5. However, a comprehensive analysis of PTPNs is still missing at the pan-cancer level, especially in AML, and most studies focus on proving the clinical value of PTPN1 expression and PTPN11 mutation in AML 14,15. PTPN2 has been found to be highly expressed in hematopoietic cells, where it plays a negative signaling role 16. Notably, the PTPN2 catalytic domain shares 74% sequence homology and similar enzyme kinetics with another family member, PTPN1 6. These results suggest that PTPN2 may play an important role in AML.
In this study, the expression and mutation status of PTPNs in pan-cancer were analyzed, and PTPN2 emerged as a potential driver of AML. The study also revealed a strong effect of PTPNs on the cell cycle and verified PTPN2 as a diagnostic biomarker for patients with AML at the clinical level. Finally, PTPRS was developed to predict prognosis and the response to cancer immunotherapy, and a nomogram with better efficacy was constructed by combining it with clinical indicators. Targeting PTPNs has always been a crucial approach for treating diseases. According to reports, PTPN1 is involved in the development of many diseases, including obesity, diabetes, cancer, and cardiovascular disease 17. PTPN1 and PTPN2 inhibitors have been developed and have become emerging means to enhance T cell antitumor immunity 18. PTPN3 is a potential immune checkpoint inhibitor target that may mediate T cells, while PTPN5 and PTPN7 can specifically inactivate MAPKs, so the developed inhibitors may have therapeutic potential for treating neurodegenerative diseases in AML patients 19,20. Targeting PTPN6 is an effective treatment for combating diabetes 21. PTPN11 has always been a focus of attention in the field of human diseases, especially cancer, and can bind to multiple immune inhibitory receptors and inhibit the activation of immune cells 22. PTPN11 regulates numerous cascade pathways, such as RAS-RAF-ERK and JAK-STAT, and is closely associated with immunotherapy response 23. PTPN12 is considered a promising therapeutic target for critical diseases such as cancer, diabetes, metabolic diseases, and autoimmune diseases and has been used for therapeutic intervention in acute myocardial infarction 24,25. PTPN13 and PTPN23 act as tumor suppressors in various tumors [26][27][28][29]. PTPN22 inhibitors have enormous potential to enhance the efficacy of current immunotherapy strategies 30. However, there are still gaps in the development of targeted drugs for PTPN13, PTPN14, PTPN18, PTPN21, and PTPN23.
Immunotherapy was first identified as an effective treatment for tumors by Wilhelm Busch and Friedrich Fehleisen in the nineteenth century 31. In recent years, monoclonal antibodies against specific targets on tumor cells have been widely used to treat hematological malignancies, either in combination with chemotherapy or as a single agent 32. PTPN11 is an effective target for the treatment of hematological malignancies and can also bind to various immune inhibitory receptors 22,33. Considering that PTPN2 and PTPN11 belong to the same family of PTPs, the combination of PTPN2 inhibitors with immunotherapy is a promising strategy. The role of PTPN2 as a biomarker in tumor microenvironments was systematically studied. PTPN2 was strongly correlated with six tumor stemness indexes in many cancers, and was positively correlated with RNA modification regulators, ICPs, immunomodulatory genes, and mismatch repair related genes.
But in some studies, the absence of PTPN2 in B16 tumours does not produce significant differences in bone marrow cell infiltration 10 .This may be related to the different tumor microenvironments of melanoma and AML, suggesting that PTPN2 may not act as a therapeutic target in all hematologic tumors, but its role in AML is indispensable.
The use of public databases and computational models to identify optimal personalized therapeutic agents and drug combinations has become increasingly popular 34. In this research, the biomarker correlation and predictive power of PTPN2 in 25 immunotherapy cohorts were analyzed. At the same time, sensitive drugs were predicted based on PTPN2 expression in multiple databases. We found that in 12 immunotherapy cohorts, PTPN2 alone had an AUC of over 0.5, with a higher predictive value than TMB, T.Clonality, and B.Clonality. More importantly, the differential expression of PTPN2 between the bone marrow of patients and normal subjects was verified using clinical samples, and a series of targeted small-molecule drugs with good predicted therapeutic effects are reported in this paper, providing guidance for clinical drug use. Notably, PTPN2 inhibitors have been successfully developed 6, so the application of PTPN2 inhibitors combined with immunotherapy in AML has promising potential.
Finally, PTPRS was developed and validated, and it performed well in external validation cohorts. It has the advantage of combining multiple AML high-throughput sequencing cohorts, and clinical indicators were combined with PTPRS to establish a nomogram with better predictive power.
There are still some limitations to the study. First, although this study largely corrected the batch effect across multiple cohorts, the implications for genomic analysis should be further analyzed using larger data sets from multiple databases. Second, the effect of PTPNs on the cell cycle still needs to be verified by further experiments. Finally, the predictive effectiveness of PTPRS has not been validated in an in-house cohort.

Data retrieval, collection, and preprocessing
First, the pan-cancer RNA sequencing (RNA-seq) data (FPKM value) and the corresponding survival information of The Cancer Genome Atlas (TCGA) 35 were extracted from the UCSC Xena Browser (https://xena.ucsc.edu/) 36. Full names and abbreviations of all cancers are listed in Table S1.
Next, transcriptome information of 151 patients in the TCGA-LAML cohort, 187 patients in the TARGET-AML 37, 450 patients in the beatAML 38  TCGA-LAML was gathered and processed using the GISTIC 2.0 algorithm 39, and somatic mutation profiles (Varscan) were obtained in mutation annotation format (MAF) using the R package "maftools" 40.
For the TCGA-LAML, TARGET-AML, and beatAML cohorts, the FPKM values were converted into TPM values for consistency, and further subjected to log2(x + 1) transformation for normalization.For all cohorts obtained from the GEO, "normalizeBetweenArrays" function in the R package "limma" was used for normalization 52 .
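The FPKM-to-TPM conversion and log2(x + 1) transformation described above can be sketched as follows. This is a minimal Python illustration using the standard relation TPM_i = FPKM_i / Σ_j FPKM_j × 10^6 within one sample; it is not the authors' R code, and the gene names other than PTPN2, as well as all numeric values, are hypothetical.

```python
import math


def fpkm_to_tpm(fpkm):
    """Convert one sample's FPKM values to TPM: TPM_i = FPKM_i / sum(FPKM) * 1e6."""
    total = sum(fpkm.values())
    if total == 0:
        # A sample with no detected expression maps to all zeros.
        return {gene: 0.0 for gene in fpkm}
    return {gene: value / total * 1e6 for gene, value in fpkm.items()}


def log2_plus_one(values):
    """Apply the log2(x + 1) transformation to every gene."""
    return {gene: math.log2(value + 1.0) for gene, value in values.items()}


# Hypothetical FPKM values for a single sample (not data from the study).
sample_fpkm = {"PTPN2": 30.0, "GENE_A": 50.0, "GENE_B": 20.0}

tpm = fpkm_to_tpm(sample_fpkm)
log_tpm = log2_plus_one(tpm)
print(tpm["PTPN2"], round(log_tpm["PTPN2"], 3))
```

Because TPM values in every sample sum to one million, the conversion puts samples from different cohorts on a common scale before the log transformation.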
Then, the batch effect was corrected with the "removeBatchEffect" function in "limma" 52 , and their gene expression data was ultimately standardized via Min-Max normalization for downstream multi-database analysis.
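The Min-Max normalization step mentioned above rescales values to the range [0, 1] via (x − min) / (max − min). The Python sketch below illustrates only this rescaling; the batch correction performed with "removeBatchEffect" is not reproduced. Whether the authors scaled per gene or per sample is not stated in the text, so applying the function to one gene's values across samples is an assumption, as are the numbers.

```python
def min_max_normalize(values):
    """Rescale a list of numbers to [0, 1] using (x - min) / (max - min)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Constant input: no spread to rescale, so return zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical expression of one gene across four samples (assumed per-gene scaling).
expr = [2.0, 4.0, 6.0, 10.0]
scaled = min_max_normalize(expr)
print(scaled)
```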

Deciphering pan-cancer expression pattern of PTPNs
We systematically analyzed the expression levels of PTPNs (protein tyrosine phosphatases) between tumorous and adjacent normal tissues at the pan-cancer level using the ONCOMINE 53 , TIMER 54 and the TCGA-Pan-Cancer atlas.
Additionally, we investigated the mRNA and protein levels of PTPN2 expression in normal or tumor tissues and normal or tumor cell lines using the CCLE, Human Protein Atlas (HPA: https://www.proteinatlas.org/), Harmonizome (https://maayanlab.cloud/Harmonizome/), and BioGPS portal 57,58. Furthermore, we deciphered the pan-cancer expression pattern of PTPN2 at the single-cell level using TISCH 59,60. We utilized cBioPortal (https://www.cbioportal.org/) to depict the pan-cancer genomic landscape of PTPNs in terms of CNV and single nucleotide polymorphisms (SNPs) 61. The role of PTPNs in diseases, together with systematic drug-target identification and prioritization, was analyzed preliminarily based on underlying evidence through the Open Targets Platform (https://www.opentargets.org/) 62. Using the GSCA platform, which visualizes the pan-cancer phenotypic characteristics of PTPNs, this study analyzed the correlation between PTPNs and prognosis or clinical subtypes at the pan-cancer level 63. Additionally, the relationship between PTPN2 expression and clinical outcomes, including overall survival (OS), progression-free interval (PFI), disease-free interval (DFI) and disease-specific survival (DSS), was analyzed and visualized at the pan-cancer level with the help of the Sangerbox platform (https://vip.sangerbox.com/home.html) 64.
Integrated cell line datasets with drug sensitivity information were extracted from the Genomics of Drug Sensitivity in Cancer (GDSC) 77, and predicted sensitivity to chemotherapeutic treatment was inferred using oncoPredict 78. The Cancer Therapeutics Response Portal (CTRP) 79 and the profiling of relative inhibition simultaneously in mixtures (PRISM) database 80 were used to analyze drug sensitivity relationships between PTPN2 and chemotherapeutic agents. Special thanks are given to Chen Yang for his support with R-script design and splendid analysis methodologies 81, and to Paul Geeleher for development of the R package "pRRophetic" 82.
Correlation between PTPN2 and drug sensitivity was also analyzed using CellMiner, and the CMap score was calculated to predict potential drugs reversing the molecular features of the disease [83][84][85]. Additionally, PTPN2 expression was compared between response and non-response groups in cohorts receiving immunotherapy.

RNA extraction and qRT-PCR
Total RNA was extracted from bone marrow samples from AML patients and normal individuals using Trizol reagent (Invitrogen, Carlsbad, CA, U.S.A.) according to the manufacturer's protocol. Superscript II reverse transcriptase and random primers were used to synthesize cDNA. Quantitative real-time PCR (qRT-PCR) was performed on the ABI 7900HT Sequence Detection System with SYBR-Green dye (Applied Biosystems, Foster City, CA, U.S.A.). All primers are listed in Table S2. Expression levels of PTPN2 were calculated using the $2^{-\Delta\Delta C_T}$ method.
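For readers unfamiliar with the relative quantification method named above, the calculation can be sketched in Python as follows. ΔCT is the target CT minus the reference-gene CT within a sample, ΔΔCT is the patient ΔCT minus the control ΔCT, and the fold change is 2 raised to −ΔΔCT. The reference gene is not named in this section (primers are in Table S2), so the reference CT values below, like all other numbers, are hypothetical.

```python
def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_control, ct_ref_control):
    """Fold change of the target gene by the 2^(-ddCT) method."""
    delta_sample = ct_target_sample - ct_ref_sample      # dCT in the test sample
    delta_control = ct_target_control - ct_ref_control   # dCT in the control sample
    delta_delta = delta_sample - delta_control           # ddCT
    return 2.0 ** (-delta_delta)


# Hypothetical CT values: PTPN2 and an unspecified reference gene,
# in one AML sample and one normal-donor sample.
fold = relative_expression(
    ct_target_sample=24.0, ct_ref_sample=18.0,
    ct_target_control=26.0, ct_ref_control=18.0,
)
print(f"Fold change: {fold}")
```

A fold change above 1 means higher PTPN2 expression in the test sample than in the control after normalization to the reference gene.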

Construction and validation of the prediction model
A least absolute shrinkage and selection operator (LASSO) penalized Cox proportional hazards regression model (CoxPH) 86 with tenfold cross-validation was used to construct a PTPRS for the prognostication of patients 87. The PTPRS for individual patients was calculated as follows: $\mathrm{PTPRS} = \sum_{i} (\mathrm{Exp}_i \times \mathrm{Coef}_i)$, where $\mathrm{Exp}_i$ is the expression of gene $i$ and $\mathrm{Coef}_i$ is its coefficient. Univariate and multivariate CoxPHs were used to evaluate the prognostic value of PTPRS. To further unravel the underlying biological mechanisms relating to the PTPRS, differentially expressed genes (DEGs) between different risk groups were screened using the DESeq2 package 88. Functional enrichment analyses (GO and KEGG pathways) as well as GSEA on the DEGs were conducted and visualized via the R packages "clusterProfiler 4.0" 89 and "GOplot" 90. Kaplan-Meier curves, time-dependent receiver operating characteristic curve (ROC) analysis, decision curve analysis (DCA) and concordance index (C-index) curves were used to evaluate the prognostic role of the DEGs with the R packages "pROC" and "pec" 91. Finally, we compared the prediction ability of the PTPRS with that of five other prognostic models [92][93][94][95][96] and confirmed it as an independent prognostic factor relative to other clinical biomarkers via multivariate CoxPH.
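The risk score calculation described above, together with the median-based split into high-risk and low-risk groups used in the Results, can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the gene names, coefficients and patient values are hypothetical, the model fitting itself (LASSO CoxPH with cross-validation) is not shown, and assigning scores equal to the median to the low-risk group is an arbitrary choice not specified in the text.

```python
from statistics import median


def risk_score(expression, coefficients):
    """Weighted sum of gene expression: sum over genes of expression * coefficient."""
    return sum(expression[gene] * coef for gene, coef in coefficients.items())


def split_by_median(scores):
    """Label each patient 'high' if the score exceeds the cohort median, else 'low'."""
    cutoff = median(scores.values())
    return {pid: ("high" if s > cutoff else "low") for pid, s in scores.items()}


# Hypothetical model coefficients and patient expression values.
coefficients = {"GENE_A": 0.5, "GENE_B": -0.25}
patients = {
    "P1": {"GENE_A": 2.0, "GENE_B": 4.0},
    "P2": {"GENE_A": 4.0, "GENE_B": 0.0},
    "P3": {"GENE_A": 0.0, "GENE_B": 4.0},
    "P4": {"GENE_A": 8.0, "GENE_B": 4.0},
}

scores = {pid: risk_score(expr, coefficients) for pid, expr in patients.items()}
groups = split_by_median(scores)
print(scores)
print(groups)
```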

Underlying microenvironment between samples in high-and low-risk group
The immune profile was visualized via heatmap, displaying expression of ICP, abundance of 24 immunocyte infiltration, immune score, stromal score and DNA methylation of tumor-infiltrating lymphocytes (MeTILs) 97 .
Then, the differences of 75 immunomodulators between two subtypes were further analyzed at the multiomics level: mRNA expression, gene expression correlation with DNA-methylation beta-value, amplification frequency (the difference between the fraction of samples in which an immunomodulator was amplified in a particular subtype and the amplification fraction in all samples) and deletion frequency.
Finally, ssGSEA algorithm, which enabled us to quantify the absolute enrichment of various TME infiltration cells via the immune deconvolution analyses, was implemented to investigate the differences of 34 immunocytes in distinct subtypes.

Establishment, superiority and validation of a nomogram speculating prognosis
To further quantify the predictive performance of PTPRS, we constructed a nomogram based on the training set that integrated the PTPRS and other clinical features of patients using the R package "rms". The performance of the nomogram was validated and calibrated using DCA and time-dependent ROC analysis in the training, validation and metaX cohorts.

Statistical analysis
Group differences were evaluated using the Mann-Whitney U test. Correlations between variables were analyzed with Pearson's or Spearman's correlation analysis as appropriate. P < 0.05 was considered statistically significant in comparisons between groups. R (version 4.1.3) was used for data processing and statistical analyses.
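To make the correlation step concrete, the following Python sketch computes Spearman's correlation as the Pearson correlation of average ranks, which is the standard definition. It is an illustration only: the study performed these analyses in R, the helper names are hypothetical, the data are invented, and significance testing (the P value) is not included.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient (assumes neither input is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical expression of PTPN2 and another marker in five samples.
ptpn2 = [1.0, 2.0, 3.0, 4.0, 5.0]
marker = [2.0, 4.0, 5.0, 9.0, 20.0]
rho = spearman(ptpn2, marker)
print(f"Spearman rho = {rho:.3f}")
```

Because Spearman's coefficient depends only on ranks, the non-linear but monotonic relationship in this toy data still yields a coefficient of 1, whereas Pearson's coefficient on the raw values would be lower.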

Figure 1. Pan-cancer analysis of PTPNs. (a) Differential expression of PTPNs. (b) The composition of the heterozygous/homozygous CNV of PTPNs in pan-cancer. (c) The mutation distribution of the top 10 mutated genes in PTPNs and a classification of SNV types. (d) The percentage of cancers in which PTPNs expression has a potential effect (FDR ≤ 0.05) on pathway activity. (e) Priority of PTPNs among four immunosuppressive indices, including the T-cell dysfunction levels, ICB response outcome, phenotypes in CRISPR screens, and T-cell exclusion cell types. (f) Correlation between PTPNs expression and drug IC50.

Figure 3. Association of PTPN2 with cancer pathways and immune processes. (a) Immunophenotypes Enrichment analysis for metabolism pathway and cancer signaling between high and low PTPN2 expression. (b) Correlation between PTPN2 and ICP. (c) Correlation between PTPN2 expression and immune infiltration in pan-cancer. *P < 0.05.

Figure 4. Prognostic value and biomarker potential of PTPN2. (a) Effect of PTPN2 on cancer prognosis. (b) Correlation between PTPN2 expression and overall survival in the TCGA and TARGET cohort. (c) A comparison of PTPN2 expression before and after ICB treatments across different tumor models in vivo. (d) Ability of PTPN2 to predict response outcome and overall survival in immunotherapy cohorts. (e) The top 12 drugs positively correlated with PTPN2 expression in the CellMiner database.

Figure 5. Prognostic value and biomarker potential of PTPN2. (a, b) The expression of PTPN2 in different cell types. (c) Expression level of PTPN2 in the bone marrow samples of AML patients and normal donors. (d) The prognostic role of PTPN2 in the TCGA-LAML cohort. (e, f) Correlation between PTPN2 expression and cancer microenvironment-related signatures in TCGA-LAML. *P < 0.05, **P < 0.01, ***P < 0.001.

Figure 6. Drug discovery of PTPN2 in AML. (a) Drug screening in patients with high PTPN2 expression in CTRP and PRISM. (b) CMAP analysis between high and low PTPN2 expression groups.

Figure 8. Clinical association of PTPRS and construction of nomogram. (a) The difference in clinicopathologic features and pathological stages of AML between high-risk and low-risk group. (b) Nomogram predicting the 1-, 2-, and 3-year OS in patients with AML. (c) The calibration curves for predicting patient OS at 1, 2, and 3 years. (d, e) DCA curves of the nomogram, PTPRS and other pooled models for predicting 1-, 2-, and 3-year OS. (f) Time-dependent AUC values of nomogram and PTPRS for the prediction of OS.